You can run and edit these examples interactively on Galaxy
Fetch paginated data from the MGnify API, and save it as a CSV file
The MGnify API returns paginated data. When you list data, it comes to you in pages, or chunks. You have to request each page in turn. The jsonapi_client package can do this for you, automatically.
This example shows you how to download a paginated list of data and save it as a CSV file.
You can find all of the other “API endpoints” using the Browsable API interface in your web browser. The URL you see in the browsable API is exactly the same as the one you can use in this code.
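To make the idea of requesting each page in turn more concrete, here is a minimal sketch of what following the pages by hand could look like. It assumes each page of the response is a JSON object with a "data" list and a "links" object whose "next" entry holds the URL of the following page; that layout is an assumption about the response, not something this notebook shows. The helper iterate_records and the FAKE_PAGES dictionary are hypothetical names used for illustration only, and no network request is made.

```python
from typing import Callable, Iterator

# Base URL taken from the notebook; the endpoint name is appended to it.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def iterate_records(fetch_json: Callable[[str], dict], url: str) -> Iterator[dict]:
    """Yield every record from a paginated listing, one page at a time.

    Assumes each page is a dict with a "data" list and a "links" dict whose
    "next" entry is the URL of the following page (or None on the last page).
    """
    while url:
        page = fetch_json(url)
        yield from page.get("data", [])
        url = (page.get("links") or {}).get("next")


# Hypothetical stand-in for the network: two small pages keyed by URL.
FAKE_PAGES = {
    f"{API_BASE}/super-studies": {
        "data": [{"id": "a"}, {"id": "b"}],
        "links": {"next": f"{API_BASE}/super-studies?page=2"},
    },
    f"{API_BASE}/super-studies?page=2": {
        "data": [{"id": "c"}],
        "links": {"next": None},
    },
}

start_url = f"{API_BASE}/super-studies"
record_ids = [r["id"] for r in iterate_records(FAKE_PAGES.__getitem__, start_url)]
print(record_ids)
```

With a real connection, fetch_json could be something like a small function that calls requests.get(url).json(). In practice the jsonapi_client package handles this loop for you, so the sketch is only meant to show what is happening behind the scenes.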
This is an interactive code notebook (a Jupyter Notebook). To run this code, click into each cell and press the ▶ button in the top toolbar, or press shift+enter.
We pick an API endpoint for the kind of data to download:
from lib.variable_utils import get_variable_from_link_or_input

# You can also just directly set the api_endpoint variable in code, like this:
# api_endpoint = 'super-studies'
api_endpoint = get_variable_from_link_or_input('API_ENDPOINT', 'API Endpoint', 'super-studies')
Using API Endpoint super-studies from the link you followed.
Using "super-studies" as API Endpoint
Use jsonapi_client to go through the paginated data. Note that this may take quite a long time for long lists, because the API automatically slows down your connection if you request a lot of data. This keeps the service working well for everybody else.
We use pandas, an excellent library for data analysis, to normalise the data into a table.
from jsonapi_client import Session
import pandas as pd

with Session("https://www.ebi.ac.uk/metagenomics/api/v1") as mgnify:
    resources = map(lambda r: r.json, mgnify.iterate(api_endpoint))
    resources = pd.json_normalize(resources)
    resources.to_csv(f"{api_endpoint}.csv")
resources
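To see what the normalising step does, here is a small offline sketch using made-up records. The records are only assumed to resemble what the API returns (an id, a type and a nested "attributes" object); the field names title and count are invented for illustration. pd.json_normalize flattens the nested object into dotted column names, and the CSV is written to an in-memory buffer here rather than to a file.

```python
import io

import pandas as pd

# Made-up records with a nested "attributes" object (illustrative only).
records = [
    {"id": "1", "type": "example", "attributes": {"title": "First", "count": 3}},
    {"id": "2", "type": "example", "attributes": {"title": "Second", "count": 5}},
]

# Nested keys become dotted column names such as "attributes.title".
table = pd.json_normalize(records)
print(list(table.columns))

# to_csv writes the row index as the first column by default,
# so read it back with index_col=0 to recover the same table shape.
buffer = io.StringIO()
table.to_csv(buffer)
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)
print(restored.shape)
```

If you do not want the row index in your file, passing index=False to to_csv leaves it out.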